Network-Aware Locality Scheduling for Distributed Data Operators in Data Centers

Abstract

Large data centers are currently the mainstream infrastructures for big data processing. As one of the most fundamental tasks in these environments, the efficient execution of distributed data operators (e.g., join and aggregation) is still challenging for current data systems, and one of the key performance issues is network communication time. State-of-the-art methods trying to improve that problem focus on either application-layer data locality optimization to reduce network traffic or network-layer data flow optimization to increase bandwidth utilization. However, the techniques in the two layers are totally independent from each other, and performance gains from a joint optimization perspective have not yet been explored. In this article, we propose a novel approach called NEAL (NEtwork-Aware Locality scheduling) to bridge this gap, and consequently to further reduce the communication time for distributed big data operators. We present the detailed design and implementation of NEAL, and our experimental results demonstrate that NEAL always performs better than current approaches for different workloads and network bandwidth configurations.
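The difference between reducing traffic and reducing communication time can be illustrated with a toy model. The sketch below is not the NEAL algorithm, whose design is not given in this abstract; it is a minimal illustration under stated assumptions: each key of a distributed operator must be gathered on one destination node, nodes send in parallel, and only each node's outgoing bandwidth limits transfers. All function names and the data layout are hypothetical.

```python
from itertools import product


def transfer_time(assignment, sizes, uplink):
    """Time until all transfers finish, assuming nodes send in parallel.

    sizes[k][n]   : bytes of key k stored on node n
    assignment[k] : destination node chosen for key k
    uplink[n]     : outgoing bandwidth of node n in bytes/s (assumed model)
    """
    sent = [0.0] * len(uplink)
    for k, dest in enumerate(assignment):
        for n, size in enumerate(sizes[k]):
            if n != dest:  # local data does not cross the network
                sent[n] += size
    return max(s / b for s, b in zip(sent, uplink))


def locality_only(sizes):
    """Send each key to the node already holding most of it (minimizes traffic)."""
    return [max(range(len(row)), key=row.__getitem__) for row in sizes]


def network_aware(sizes, uplink):
    """Brute-force the assignment with the lowest transfer time (toy scale only)."""
    nodes = range(len(uplink))
    return list(min(product(nodes, repeat=len(sizes)),
                    key=lambda a: transfer_time(a, sizes, uplink)))


if __name__ == "__main__":
    sizes = [[60, 40], [30, 50]]   # two keys spread over two nodes
    uplink = [100, 10]             # node 1 has a slow outgoing link
    for name, plan in (("locality-only", locality_only(sizes)),
                       ("network-aware", network_aware(sizes, uplink))):
        print(name, plan, transfer_time(plan, sizes, uplink))
```

In this example the locality-only plan moves less data (70 bytes versus 90) but pushes 40 bytes through the slow link and takes 4.0 s, while the network-aware plan finishes in 0.9 s. The exhaustive search is exponential in the number of keys and is used here only to keep the sketch short.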

Similar resources

Optimal network locality in distributed virtualized data-centers

Cost efficiency is a key aspect in deploying distributed service in networks within decentralized service delivery architectures. In this paper, we address this aspect from an optimization and algorithmic standpoint. The research deals with the placement of service components to network sites, where the performance metric is the cost for acquiring components between the sites. The resulting opt...

Energy Aware Task Scheduling in Data Centers

Energy consumption is now a major issue for data centers. It increases significantly as CPU frequency rises. With Dynamic Voltage and Frequency Scaling (DVFS) techniques, the CPU can be set to a suitable working frequency at run time according to the workload. On the other hand, reducing frequency implies that more servers will be ...

Data Locality-Aware Big Data Query Evaluation in Distributed Clouds

With more and more businesses and organizations outsourcing their IT services to distributed clouds for cost savings, historical and operational data generated by the services have been growing exponentially. The generated data that are referred to as big data, stored at different geographic datacenters, now become an invaluable asset to these businesses and organizations, as they can make use ...

Near Data Scheduling for Data Centers with Multi Levels of Data Locality

Data locality is a fundamental issue for data-parallel applications. Considering MapReduce in Hadoop, the map task scheduling part requires an efficient algorithm which takes data locality into consideration; otherwise, the system may become unstable under loads inside the system’s capacity region, or jobs may experience undesirably long completion times. The data chunk needed for any m...

Locality Aware Task Scheduling in Parallel Data Stream Processing

Parallel data processing and parallel streaming systems have become quite popular. They are employed in various domains such as real-time signal processing, OLAP database systems, or high-performance data extraction. One of the key components of these systems is the task scheduler, which plans and executes tasks spawned by the system on available CPU cores. The multiprocessor systems and CPU architec...

Journal

Journal title: IEEE Transactions on Parallel and Distributed Systems

Year: 2021

ISSN: 1045-9219, 1558-2183, 2161-9883

DOI: https://doi.org/10.1109/tpds.2021.3053241